Xin Jin, Rongyan Li, Xian Shen, Rongfang Bie

March 2007
SAC '07: Proceedings of the 2007 ACM symposium on Applied computing

Publisher: ACM 


A great challenge of web mining arises from the increasingly large web pages and 
the high dimensionality associated with natural language. Since classifying web 
pages of an interesting class is often the first step of mining the web, web page 
categorization/classification ...

Tong Zhang, Alexandrin Popescul, Byron Dom

August 2006
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

Publisher: ACM 


We present a risk minimization formulation for learning from both text and graph 
structures which is motivated by the problem of collective inference for hypertext 
document categorization. The method is based on graph regularization formulated as 
a well-formed ... 

Keywords: collective inference, document classification, graph and relational 
learning, regularization, semi-supervised learning 



3 Machine learning methods for Chinese web page categorization

Ji He, Ah-Hwee Tan, Chew-Lim Tan 

October 2000
Proceedings of the second workshop on Chinese language processing: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 12
Publisher: Association for Computational Linguistics 


This paper reports our evaluation of k Nearest Neighbor (kNN), Support Vector 
Machines (SVM), and Adaptive Resonance Associative Map (ARAM) on Chinese web 
page classification. Benchmark experiments based on a Chinese web corpus showed 
that their ... 

Dou Shen, Zheng Chen, Qiang Yang, Hua-Jun Zeng, Benyu Zhang, Yuchang Lu, Wei-Ying Ma

July 2004
SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Publisher: ACM 


Web-page classification is much more difficult than pure-text classification due to a 
large variety of noisy information embedded in Web pages. In this paper, we propose 
a new Web-page classification algorithm based on Web summarization for 
improving ... 

Keywords: content body, web page categorization, web page summarization 

Ioannis Anagnostopoulos, Photis Stavropoulos, Georgios Kouzas, Christos Anagnostopoulos, Dimitrios D. Vergados

July 2006
ICWE '06: Workshop proceedings of the sixth international conference on Web engineering
Publisher: ACM 


This paper proposes a statistical approach for estimating the evolution of categorized 
web pages. The proposal is based on the capture-recapture method used in wildlife 
biological studies and it is modified according to the necessary assumptions and ... 

Keywords: capture-recapture measurements, web evolution, web page 
categorization 



6 Knowing a web page by the company it keeps

Xiaoguang Qi, Brian D. Davison

November 2006
CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management

Publisher: ACM 



Web page classification is important to many tasks in information retrieval and web
mining. However, applying traditional textual classifiers on web data often produces 
unsatisfying results. Fortunately, hyperlink information provides important clues ... 

Keywords: SVM, neighboring, rainbow, web page classification 

March 2005
SAC '05: Proceedings of the 2005 ACM symposium on Applied computing

Publisher: ACM 


Classifying text into predefined categories is a fundamental task in information 
retrieval (IR). IR and web mining techniques have been applied to categorize web
pages to enable users to manage and use the huge amount of information available 
on the ... 

Keywords: dissimilarity measure, n-grams, text categorization, web contents, web 
mining 

B. L. Narayan, C. A. Murthy, Sankar K. Pal

October 2003
WI '03: Proceedings of the 2003 IEEE/WIC International Conference on Web Intelligence

Publisher: IEEE Computer Society 


PageRank is primarily based on link structure analysis. Recently, it has been shown 
that content information can be utilized to improve link analysis. We propose a novel 
algorithm that harnesses the information contained in the history of a surfer to ... 



9 Interest-based personalized search

Zhongming Ma, Gautam Pant, Olivia R. Liu Sheng 

February 2007
ACM Transactions on Information Systems (TOIS), Volume 25 Issue 1
Publisher: ACM 




Web search engines typically provide search results without considering user 
interests or context. We propose a personalized search approach that can easily 
extend a conventional search engine on the client side. Our mapping framework 
automatically maps ... 

Keywords: Open Directory, Personalized search, World Wide Web, information 
retrieval, user interest, user interface 



10 Joint categorization of queries and clips for web-based video search

Ruofei Zhang, Ramesh Sarukkai, Jyh-Herng Chow, Wei Dai, Zhongfei Zhang

October 2006
MIR '06: Proceedings of the 8th ACM international workshop on Multimedia information retrieval

Publisher: ACM 


Building a video search engine on the Web is a very challenging problem. Compared 
with web page search, video search has its unique characteristics (such as high 
volume of data for each video, existence of multi-modal information including
meta-data, ...

Keywords: experiment, multi-modality based categorization, query categorization, 
video categorization, web-based video search 

Luis Gravano, Vasileios Hatzivassiloglou, Richard Lichtenstein

November 2003
CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management

Publisher: ACM 


Web pages (and resources, in general) can be characterized according to their 
geographical locality. For example, a web page with general information about 
wildflowers could be considered a global page, likely to be of interest to a 
geographically ... 

Keywords: information retrieval, query classification, query modification, search 
engines, web search 



Jianping Zhang, Jason Qin, Qiuming Yan 

December 2006
WI '06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence

Publisher: IEEE Computer Society 


By analyzing a set of access attempts by teenagers to pornographic websites, we 
found that more than half of them are image searches and visits to websites with 
little text information. It is obvious that textual content-based filters cannot 
correctly ... 

Jiahui Liu, Larry Birnbaum

November 2007
WI '07: Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence

Publisher: IEEE Computer Society 

Full text available: pdf (525.06 KB)

The importance of named entities in information retrieval and knowledge 
management has recently brought interest in characterizing semantic relationships 
between entities. In this paper, we propose a method for measuring semantic 
similarity, an important ... 

Robert Wetzker, Tansu Alpcan, Christian Bauckhage, Winfried Umbrath, Sahin Albayrak

November 2007
WI '07: Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence

Publisher: IEEE Computer Society 


We propose a hierarchical approach to document categorization that requires no
pre-configuration and maps the semantic document space to a predefined taxonomy. The
utilization of search engines to train a hierarchical classifier makes our approach 
more ... 

B. L. Narayan, Sankar K. Pal

September 2004
WI '04: Proceedings of the 2004 IEEE/WIC/ACM International Conference on Web Intelligence

Publisher: IEEE Computer Society 

Full text available: pdf, Publisher Site



The research activities of the WIC-India Research Center include topics like
improving the performance of search engines, link and neighborhood analysis, as 
well as, surfer modeling for ranking and categorization of web pages, and query 
answering. In ... 

Mark A. Rosso

June 2005
JCDL '05: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Publisher: ACM 


Many have suggested the use of genres to ameliorate the problem of web search,
e.g. [1,3,4,5,6,7]. A central issue in the implementation of this idea is the choice of
genres to be used as web page descriptors. Several studies have explored user 
terminology ... 

Keywords: classification, genre, metadata, web search 

Rahul Singh, Ya-Wen Hsu

September 2007
MULTIMEDIA '07: Proceedings of the 15th international conference on Multimedia
Publisher: ACM 


With the rapid growth in the volume, complexity, and heterogeneity of information in 
the World Wide Web (WWW), the role of user-data interaction paradigms is 
becoming increasingly critical to the success of web-based information retrieval and 
assimilation. ... 

Koji Nakahira, Toshihiko Yamasaki, Kiyoharu Aizawa

May 2005
WWW '05: Special interest tracks and posters of the 14th international conference on World Wide Web
Publisher: ACM 

Full text available: pdf (299.81 KB)

We propose a function-oriented classification of web images and show new 
applications using this categorization. We defined nine categories of images taking 
into account of their functions used in web pages, and classified web images by using 
Support ... 

Keywords: classification, support vector machine, web images 

A clustering

Christopher H. Brooks, Nancy Montanez 

May 2006
WWW '06: Proceedings of the 15th international conference on World Wide Web
Publisher: ACM 


Tags have recently become popular as a means of annotating and organizing Web 
pages and blog entries. Advocates of tagging argue that the use of tags produces a 
'folksonomy', a system in which the meaning of a tag is determined by its use among 
the community ... 

Keywords: automated annotation, blogs, hierarchical clustering, tagging 

Balachander Krishnamurthy, Craig E. Wills

May 2002
WWW '02: Proceedings of the 11th international conference on World Wide Web
Publisher: ACM 


We categorize the set of clients communicating with a server on the Web based on 
information that can be determined by the server. The Web server uses the 
information to direct tailored actions. Users with poor connectivity may choose not to 
stay at ... 

Keywords: client characterization, client connectivity, server adaptation 



Results 1 - 20 of 1,952

The ACM Portal is published by the Association for Computing Machinery. Copyright © 2008 ACM, Inc.






